Feature selection using nearest attributes
Abstract
Feature selection is an important problem in high-dimensional data analysis and classification. Conventional feature selection approaches focus on detecting features based on a redundancy criterion using learning and feature searching schemes. In contrast, we present an approach that selects features based on their discriminatory ability among classes. The area of overlap between the inter-class and intra-class distances resulting from feature-to-feature comparison of an attribute is used as a measure of the discriminatory ability of the feature. A set of nearest attributes in a pattern having the lowest area of overlap, within a degree of tolerance defined by a selection threshold, is selected to represent the best available discriminable features. State-of-the-art recognition results are reported for pattern classification problems by using the proposed feature selection scheme with the nearest neighbour classifier. These results are reported on benchmark databases with high-dimensional feature vectors, in problems involving images and microarray data.
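The overlap criterion described in the abstract can be illustrated with a minimal sketch. This is one possible reading, not the authors' implementation: the abstract does not say how the distances are computed or how the overlap area is estimated, so the absolute difference as the per-feature distance, the histogram-intersection estimate, and the names `overlap_area`, `select_features`, `bins` and `tolerance` are all assumptions made for illustration.

```python
import numpy as np

def overlap_area(values, labels, bins=20):
    """Estimate the overlap between intra-class and inter-class distance
    distributions for a single feature (assumed: histogram intersection)."""
    values = np.asarray(values, dtype=float)
    labels = np.asarray(labels)
    # All unordered sample pairs; distance is the absolute feature difference.
    i, j = np.triu_indices(len(values), k=1)
    dist = np.abs(values[i] - values[j])
    same = labels[i] == labels[j]
    intra, inter = dist[same], dist[~same]
    # Degenerate cases: no pairs of one kind, or a constant feature.
    if intra.size == 0 or inter.size == 0 or dist.max() == 0:
        return 1.0
    # Shared bin edges so the two normalised histograms are comparable.
    edges = np.linspace(0.0, dist.max(), bins + 1)
    p_intra = np.histogram(intra, bins=edges)[0] / intra.size
    p_inter = np.histogram(inter, bins=edges)[0] / inter.size
    # Intersection of the two histograms: 0 = separated, 1 = identical.
    return float(np.minimum(p_intra, p_inter).sum())

def select_features(X, y, tolerance=0.1, bins=20):
    """Keep features whose overlap is within `tolerance` of the lowest overlap
    (assumed interpretation of the selection threshold)."""
    X = np.asarray(X, dtype=float)
    overlaps = np.array(
        [overlap_area(X[:, k], y, bins) for k in range(X.shape[1])]
    )
    keep = np.flatnonzero(overlaps <= overlaps.min() + tolerance)
    return keep, overlaps

# Toy data: feature 0 separates the two classes, feature 1 does not.
X = np.array([[0.0, 1.0], [0.1, 5.0], [0.2, 9.0],
              [10.0, 1.0], [10.1, 5.0], [10.2, 9.0]])
y = np.array([0, 0, 0, 1, 1, 1])
keep, overlaps = select_features(X, y)
```

The selected columns `X[:, keep]` could then be passed to a nearest neighbour classifier, as the abstract describes. The pairwise comparison is quadratic in the number of samples, so this sketch is only suitable for small training sets.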
Similar resources
Predict the Diagnosis of Heart Disease Using Feature Selection and k-Nearest Neighbor Algorithm
In this paper, the prediction of heart disease based on feature selection by using multilayer perceptron with back-propagation algorithm and k-nearest neighbor algorithm based on an explicit similarity measure with biomedical test values to diagnose heart disease is presented. The main motivation for this paper is to classify the heart disease with reduced number of attributes. We use the weigh...
An Improved K-Nearest Neighbor with Crow Search Algorithm for Feature Selection in Text Documents Classification
The Internet provides easy access to a kind of library resources. However, classification of documents from a large amount of data is still an issue and demands time and energy to find certain documents. Classification of similar documents in specific classes of data can reduce the time for searching the required data, particularly text documents. This is further facilitated by using Artificial...
A Classification Method for E-mail Spam Using a Hybrid Approach for Feature Selection Optimization
Spam is an unwanted email that is harmful to communications around the world. Spam leads to a growing problem in a personal email, so it would be essential to detect it. Machine learning is very useful to solve this problem as it shows good results in order to learn all the requisite patterns for classification due to its adaptive existence. Nonetheless, in spam detection, there are a large num...
Feature Selection using Misclassification Counts
Dimensionality reduction of the problem space through detection and removal of variables, contributing little or not at all to classification, is able to relieve the computational load and instance acquisition effort, considering all the data attributes accessed each time around. The approach to feature selection in this paper is based on the concept of coherent accumulation of data about class...
Journal title:
- CoRR
Volume abs/1201.5946
Publication date 2012